Online product reviews have become a highly valuable source of information for consumers making
purchase decisions and for producers seeking to improve their products and marketing strategies.
However, as the number of available reviews increases, it becomes more and more difficult for people
to manually understand and evaluate the general opinion about a particular product. Hence, an
automatic approach is preferred. One of the most popular techniques is a machine learning approach
such as Support Vector Machine (SVM). In this study, we explore the use of the Word2Vec model as
features in SVM-based sentiment analysis of product reviews in the Indonesian language. The
experimental results show that SVM performs well on the sentiment classification task with every
feature model used. However, the Word2Vec model has the lowest accuracy (only 0.70) compared to the
baseline methods, namely Bag of Words models using Binary TF, Raw TF, and TF.IDF. This is because
only a small dataset was used to train the Word2Vec model. Word2Vec needs a large number of examples
to learn the word representations and place similar words close together.

Keywords: Text classification, Word embedding, Word2Vec

1. INTRODUCTION 

Since the rise of Web 2.0, the internet has become more user-centric [1]. People are creating more
and more content on the Internet through social media, discussion boards, Web forums, and blogs.
Concurrently with these trends, an increasing number of websites have emerged where consumers can
write and read reviews and express their experiences, feelings, opinions, views, and complaints about
various products and services [2]. From a consumer behavior perspective, this can be regarded as one
of the greatest developments on the Internet. 

Online platforms have become a highly valuable source of information for both consumers and
producers. In making purchase decisions, consumers often seek advice and purchase recommendations from
others [3-4]. Previously, consumers commonly referred to advertisements in mass media to make these
decisions [5]. However, with the growth of e-commerce and the increasing number of online review
platforms, online reviews have become a reference that consumers can rely on when looking for
information about the product to be purchased [6-7]. Consumers tend to learn how much others like or
dislike a product before buying. In fact, previous research found that consumers consider online
reviews provided by other users more credible and trustworthy than traditional sources [8]. 

For producers, online reviews can serve as a reference on what people think about their products
or services and help predict the level of public acceptance. This information can help to forecast
product sales. Furthermore, negative reviews can be the basis for product improvement and marketing
strategies [9]. 

Therefore, understanding such sentiment and opinion information has become more and more
important for both producers and customers. However, as the number of available reviews increases, it
becomes more and more difficult for people to manually understand and evaluate the general opinion
about a particular product. Hence, an automatic approach is preferred. 

Sentiment analysis, also known as sentiment or polarity classification, is the task of analyzing
people’s opinions or sentiments from a piece of text, for example to decide whether the sentiment is
positive or negative [10]. We can consider sentiment analysis as a text classification problem with
sentiments as its classes. Nevertheless, sentiment classification is more challenging than traditional
topic-based classification because it requires extracting more implicit information instead of only
keywords [11]. 

One of the most popular techniques is the machine learning approach. In recent years, sentiment
classification using machine learning methods has been widely adopted and shown to provide strong
performance [12-17]. Prior research in [10] also showed that machine learning techniques perform
quite well, with SVMs tending to do best. Two key issues in the machine learning approach are how to
extract complex features and how to find out which kinds of features are more valuable [18]. Several
feature extraction methods have been proposed, such as single words [19-20], n-grams [21-22],
lexicons [23], textual features [24], and many other new models [25-27]. However, semantic features
have been infrequently employed in this field. Semantic features can disclose the implicit semantic
relationships between words, which should be useful for improving sentiment classification
performance. 

Word embedding, also known as distributed word representation [28], is a feature learning
technique in Natural Language Processing (NLP) in which words from the vocabulary are mapped to
low-dimensional vectors of real numbers [29]. By using word embedding, the semantic and syntactic
information of words can be captured from large unlabeled corpora [30-31]. Word embeddings have been
employed in many NLP works to produce more effective word representations [32-36]. One of the most
popular examples of word embedding is the Word2Vec model. Word2Vec [37] maps each word in the
vocabulary to a dense vector of real numbers using a shallow neural probabilistic language
model [38]. With Word2Vec, words that are similar will be close to each other in the embedding
space [39]. 

In this study, we explore the use of the Word2Vec model for sentiment analysis of product reviews
in the Indonesian language. Word2Vec is used as the feature representation. For the classification
task, we use a Support Vector Machine due to its strong performance. We also explore the Bag of Words
(BOW) model with several term weighting methods, including Term Frequency (TF) and Term
Frequency-Inverse Document Frequency (TF.IDF). 


2. RESEARCH METHOD 

The general flowchart of the sentiment analysis system in this study is shown in Figure 1. There are
three main stages in this system, i.e. preprocessing, building the Word2Vec model, and classification
using SVM. Each review is classified into the positive or negative class. 


[Flowchart: Preprocessing -> Word2Vec Model -> Sentiment classification using SVM -> Sentiment Analysis Result]

Figure 1. System main flowchart 

2.1. Preprocessing 

Preprocessing is conducted before the main process begins. The steps in this stage are
tokenization, case folding, and cleaning [40-43]. In tokenization, each review is split into smaller
units called tokens or terms [44]. Case folding converts all characters in the review text to
lowercase [45]. Meanwhile, in cleaning, non-alphabetic characters such as punctuation, numbers, and
HTML tags are removed. In this study, stemming and filtering are not performed because, in some
previous studies, they did not improve sentiment analysis performance. 
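
The three preprocessing steps can be illustrated with a minimal Python sketch. The function name `preprocess`, the regular expressions, and the order of the steps (case folding and cleaning first, whitespace tokenization last) are assumptions made for illustration; they are not taken from the implementation used in this study.

```python
import re

def preprocess(review):
    """Case-fold, clean, and tokenize one review (no stemming or filtering)."""
    text = review.lower()                     # case folding
    text = re.sub(r"<[^>]+>", " ", text)      # cleaning: drop HTML tags
    text = re.sub(r"[^a-z\s]", " ", text)     # cleaning: drop punctuation, digits, other non-letters
    return text.split()                       # tokenization on whitespace

# Hypothetical example review, not taken from the dataset.
tokens = preprocess("<p>Produk ini BAGUS banget, 100% recommended!</p>")
print(tokens)
```

Doing the cleaning before the split keeps the tokenizer trivial; a tokenizer that runs first would have to handle punctuation attached to words.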


2.2. Building Word2Vec model 

After the preprocessing stage, we build the word vector representation using Word2Vec. First,
the Word2Vec model builds a vocabulary from the training data. Then, it learns the vector
representation of each word. There are two training algorithms in Word2Vec, i.e. continuous
bag-of-words (CBOW) and skip-gram [46]. In this study, CBOW is employed. In CBOW, the word vectors
are learned by predicting each word from its neighboring words. The resulting word vectors are
employed as the classification features. Word2Vec can generally help to improve classification
performance because similar words have similar vectors. 
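
To make the CBOW idea concrete, the sketch below builds the (context words, target word) training pairs that a CBOW model would learn from. It only illustrates how the training examples are formed; it does not train any vectors. The helper name `cbow_pairs` and the window size are assumptions, since the window used in this study is not stated.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs; CBOW predicts the target from its context."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]       # up to `window` words before the target
        right = tokens[i + 1:i + 1 + window]      # up to `window` words after the target
        context = left + right
        if context:                               # skip one-word reviews with no context
            pairs.append((context, target))
    return pairs

# Hypothetical tokenized review with a window of one word on each side.
pairs = cbow_pairs(["produk", "ini", "bagus", "banget"], window=1)
for context, target in pairs:
    print(context, "->", target)
```

Words that appear with similar contexts across many such pairs end up with similar vectors, which is why the number of available pairs matters.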


2.3. Sentiment classification using support vector machine 

Finally, in the last stage, the reviews are classified into the positive or negative class. In this
study, a support vector machine (SVM) is used for the classification task. Despite its high
computational complexity [47], SVM has become a popular algorithm in the last decade because of its
excellent performance in text classification [48]. 

Based on the representation of the training data in feature space, SVM finds a hyperplane that
separates the positive and negative data with maximum margin. The testing data are then mapped into
that same feature space and predicted to belong to the positive or negative category based on which
side of the hyperplane they fall. In this study, we use a linear kernel because, based on the work of
McCallum and Nigam [49], linear SVM has the best performance in text classification. Another benefit
of the linear kernel is that it is faster and requires fewer parameters than other SVM kernels. 
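
The prediction rule of a linear SVM can be sketched in a few lines of Python. The weight vector `w`, bias `b`, and feature vectors below are hypothetical values chosen for illustration; in practice they come from training. The margin formula assumes the usual canonical scaling in which the closest training points satisfy |w·x + b| = 1.

```python
import math

def decision_score(x, w, b):
    """Signed score w.x + b; its sign tells which side of the hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x, w, b):
    """Label a feature vector by the side of the hyperplane it falls on."""
    return "positive" if decision_score(x, w, b) >= 0 else "negative"

def margin_width(w):
    """Width 2 / ||w|| of the margin, assuming canonical scaling of w and b."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

w, b = [1.0, -1.0], 0.0            # hypothetical trained parameters
print(predict([2.0, 0.5], w, b))   # falls on the positive side
print(predict([0.2, 1.0], w, b))   # falls on the negative side
print(margin_width(w))
```

With a linear kernel, prediction is a single dot product per review, which is one reason it is cheaper than kernels that compare each test point against the support vectors.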


3. RESULTS AND ANALYSIS 

The experiment was conducted using 772 product reviews extracted from the FemaleDaily website. The
review texts and their ratings were collected from the website (https://femaledaily.com/) and
labelled manually. There are 386 reviews labelled as positive and 386 reviews labelled as negative.
All of the reviews are in the Indonesian language. Scikit-Learn [50] was used to implement the
experiments. In the experiments, we compared the results of sentiment classification using Word2Vec
with other methods, namely Bag of Words (BOW) using Binary TF, Raw TF, and TF.IDF. We used 10-fold
cross validation, which means the product review dataset was divided into 10 folds of approximately
equal size. We iterated the experiment 10 times. In each iteration, reviews from 9 folds were used as
training data and the remaining fold was used as testing data. Average accuracy was used as the
evaluation metric. The experiment results can be seen in Figure 2. 
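
The 10-fold procedure can be sketched as follows. This is a plain-Python illustration, not the Scikit-Learn code used in the experiments. The helper names and the `train_and_score` callback are hypothetical, and the contiguous, unshuffled split is an assumption; the source does not say whether the folds were shuffled or stratified.

```python
def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for fold_id in range(k):
        size = base + (1 if fold_id < extra else 0)   # first `extra` folds get one more item
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(n_samples, train_and_score, k=10):
    """Average the accuracy returned by train_and_score(train_idx, test_idx) over k folds."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k

folds = k_fold_indices(772, k=10)
print([len(f) for f in folds])
```

Because 772 is not divisible by 10, the folds cannot be exactly equal. If the reviews are stored grouped by class, they should be shuffled before a contiguous split like this one.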

Figure 2 shows that sentiment analysis using SVM generally performs well, with an average
classification accuracy of 0.81. The best result is obtained using BOW features with TF.IDF, with an
accuracy of 0.85. In second place, BOW features with Binary TF give a slightly different result, with
an accuracy of 0.84. Meanwhile, BOW features with Raw TF come in third place with an accuracy of
0.83. Surprisingly, our proposed method has the lowest accuracy, only 0.70. 

Figure 2. Experiment results 

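
The three Bag of Words weightings compared in this experiment can be sketched as below. The function name `bow_features` and the toy documents are hypothetical, and the textbook definition idf = log(N / df) is an assumption; Scikit-Learn's default TF.IDF uses a smoothed variant with normalization, so its numbers would differ.

```python
import math
from collections import Counter

def bow_features(docs):
    """Return the vocabulary plus Binary TF, Raw TF, and TF.IDF vectors for tokenized docs."""
    vocab = sorted({t for doc in docs for t in doc})
    n_docs = len(docs)
    df = Counter(t for doc in docs for t in set(doc))      # document frequency per term
    idf = {t: math.log(n_docs / df[t]) for t in vocab}     # textbook IDF (assumption)
    binary, raw, tfidf = [], [], []
    for doc in docs:
        counts = Counter(doc)
        raw.append([counts[t] for t in vocab])                   # Raw TF: occurrence counts
        binary.append([1 if counts[t] else 0 for t in vocab])    # Binary TF: presence or absence
        tfidf.append([counts[t] * idf[t] for t in vocab])        # TF.IDF: count scaled by rarity
    return vocab, binary, raw, tfidf

# Hypothetical tokenized reviews.
docs = [["bagus", "bagus", "murah"], ["jelek", "murah"]]
vocab, binary, raw, tfidf = bow_features(docs)
print(vocab)
print(binary, raw, tfidf)
```

A term that occurs in every document, such as "murah" here, gets a TF.IDF weight of zero under this definition, so only the discriminating terms contribute.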
The dataset used in this experiment is small. With a small dataset, Word2Vec cannot capture the
semantic and syntactic information of words very well. When Word2Vec learns the word representations,
each word starts at a random position in the vector space. The words are then gradually moved closer
to the positions of words that are similar to them, based on their neighbors in the training data.
With a very large dataset, all the words can be arranged so that all those pairwise similarities are
upheld simultaneously, because there are many varied examples with which to gradually move them all
into better positions. In a small dataset, by contrast, there are very few examples in which words
that should be similar share neighbors in the training data. With so few examples of shared nearby
words, there is little basis for moving similar words close to each other. Hence, Word2Vec cannot be
trained well using a small dataset. 
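
Closeness in the embedding space is commonly measured with cosine similarity. The sketch below uses hypothetical two-dimensional vectors invented for illustration; they are not vectors learned in this study, and real Word2Vec vectors have many more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 means same direction, -1 opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings of the kind a well-trained model might produce.
vectors = {
    "bagus": [0.9, 0.1],
    "mantap": [0.8, 0.2],
    "jelek": [-0.7, 0.3],
}
sim_close = cosine_similarity(vectors["bagus"], vectors["mantap"])
sim_far = cosine_similarity(vectors["bagus"], vectors["jelek"])
print(sim_close, sim_far)
```

In a model trained on too little data, words with similar sentiment may not end up with higher similarity than unrelated words, which weakens the features passed to the classifier.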


4. CONCLUSION 

In this study, we used the Word2Vec model to represent the features for product review sentiment
classification in the Indonesian language. We used SVM as the classification method. We also compared
the Word2Vec-based classification performance with Bag of Words features using Binary TF, Raw TF, and
TF.IDF. In general, SVM performs well on sentiment classification. However, the Word2Vec model has
the lowest accuracy among the methods. This is because we only had a small dataset to train the
Word2Vec model. Word2Vec needs a large number of examples to learn the word representations and place
similar words close together. In a small dataset, there are too few examples to move the words into
better positions. 

In future work, we can use a larger dataset to build the Word2Vec model. This dataset does not
need to be labeled as positive or negative beforehand, nor does it need to be a sentiment analysis
dataset. We can use other text sources such as news, articles, Wikipedia, and so on. 